


Minimax Localization of Structural Information in Large Noisy Matrices. Poster: W055. M. Kolar

Neural Information Processing Systems

Goal: De-noise and re-order the rows and columns of the matrix to infer which biclusters are activated. If the SNR is too low then, for any biclustering procedure, the probability of failure remains bounded away from zero by a constant. Note: These procedures do not achieve the information-theoretic lower bound.


Minimax Localization of Structural Information in Large Noisy Matrices

Neural Information Processing Systems

We consider the problem of identifying a sparse set of relevant columns and rows in a large data matrix with highly corrupted entries. This problem of identifying groups from a collection of bipartite variables, such as proteins and drugs, biological species and gene sequences, or malware and signatures, is commonly referred to as biclustering or co-clustering. Despite its great practical relevance, and although several ad hoc methods are available for biclustering, theoretical analysis of the problem is largely non-existent. The problem we consider is also closely related to structured multiple hypothesis testing, an area of statistics that has recently witnessed a flurry of activity. We make the following contributions: i) We prove lower bounds on the minimum signal strength needed for successful recovery of a bicluster as a function of the noise variance, the size of the matrix, and the size of the bicluster of interest.
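To make the problem setup concrete, the following is a minimal sketch in Python with NumPy. It assumes a simple model that the abstract does not spell out: a single bicluster with a constant elevated mean, planted in i.i.d. Gaussian noise. The estimator shown, which keeps the rows and columns with the largest sums, is only an illustrative baseline and is not claimed to be one of the procedures analyzed in the paper. The function names, matrix sizes, and signal level are all hypothetical choices.

```python
import numpy as np


def planted_bicluster(n_rows, n_cols, k_rows, k_cols, signal, sigma, rng):
    """Return a noisy matrix with one planted bicluster and its true supports.

    Assumed model (not taken from the source): entries are N(0, sigma^2)
    noise, and entries inside the bicluster have their mean raised by `signal`.
    """
    rows = np.sort(rng.choice(n_rows, size=k_rows, replace=False))
    cols = np.sort(rng.choice(n_cols, size=k_cols, replace=False))
    A = sigma * rng.standard_normal((n_rows, n_cols))
    A[np.ix_(rows, cols)] += signal  # elevate the bicluster entries
    return A, rows, cols


def top_sum_estimate(A, k_rows, k_cols):
    """Illustrative baseline: keep the rows and columns with the largest sums."""
    row_hat = np.sort(np.argsort(A.sum(axis=1))[-k_rows:])
    col_hat = np.sort(np.argsort(A.sum(axis=0))[-k_cols:])
    return row_hat, col_hat


rng = np.random.default_rng(0)
# A deliberately strong signal, so that this simple baseline has an easy task.
A, rows, cols = planted_bicluster(200, 200, 20, 20, signal=6.0, sigma=1.0, rng=rng)
row_hat, col_hat = top_sum_estimate(A, 20, 20)
exact = np.array_equal(rows, row_hat) and np.array_equal(cols, col_hat)
print("exact recovery:", exact)
```

Lowering `signal` or raising `sigma` in this sketch makes exact recovery fail more often, which is the regime that lower bounds on the minimum signal strength are meant to characterize.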

